The Rate Loss of Single-Letter Characterization: 
The "Dirty" Multiple Access Channel 

Tal Philosof and Ram Zamir†

Dept. of Electrical Engineering - Systems, Tel-Aviv University 

Tel-Aviv 69978, ISRAEL

talp, zamir@eng.tau.ac.il

Submitted to IEEE Trans. on Information Theory, March 2008



Abstract 

For general memoryless systems, the typical information theoretic solution, when it exists, has a "single-letter"
form. This reflects the fact that optimum performance can be approached by a random code (or a random binning
scheme), generated using independent and identically distributed copies of some single-letter distribution. Is that the
form of the solution of any (information theoretic) problem? In fact, some counterexamples are known. The most
famous is the "two help one" problem: Körner and Marton showed that if we want to decode the modulo-two sum
of two binary sources from their independent encodings, then linear coding is better than random coding. In this
paper we provide another counterexample, the "doubly-dirty" multiple access channel (MAC). Like the
Körner-Marton problem, this is a multi-terminal scenario where side information is distributed among several terminals;
each transmitter knows part of the channel interference but the receiver is not aware of any part of it. We give an
explicit solution for the capacity region of a binary version of the doubly-dirty MAC, demonstrate how the capacity
region can be approached using a linear coding scheme, and prove that the "best known single-letter region" is
strictly contained in it. We also state a conjecture regarding a similar rate loss of single-letter characterization in
the Gaussian case.

Index Terms 

Multi-user information theory, random binning, linear lattice binning, dirty paper coding, lattice strategies,
Körner-Marton problem.

I. Introduction 

Consider the two-user / double-state memoryless multiple access channel (MAC) with transition and state
probability distributions

$$P(y \mid x_1, x_2, s_1, s_2) \quad \text{and} \quad P(s_1, s_2), \tag{1}$$

respectively, where the states $S_1$ and $S_2$ are known non-causally to user 1 and user 2, respectively. A special case
of (1) is the additive channel shown in Fig. 1. In this channel, called the doubly-dirty MAC (after Costa's "writing
on dirty paper" [1]), the total channel noise consists of three independent components: $S_1$ and $S_2$, the interference
signals, that are known to user 1 and user 2, respectively, and $Z$, the unknown noise, which is known to neither.
The channel inputs $X_1$ and $X_2$ may be subject to some average cost constraint.

† This research was partially supported by BSF grant No-2004398.

Neither the capacity region of (1) nor that of the special case of Fig. 1 is known. In this paper we consider
a particular binary version of the doubly-dirty MAC of Fig. 1, where all variables are in $\mathbb{Z}_2$, i.e., $\{0, 1\}$, and the
unknown noise is $Z = 0$. The channel output of the binary doubly-dirty MAC is given by

$$Y = X_1 \oplus X_2 \oplus S_1 \oplus S_2, \tag{2}$$

where $\oplus$ denotes mod-2 addition (xor), and $S_1, S_2$ are Bernoulli(1/2) and independent. Each of the codewords
$\mathbf{x}_i \in \mathbb{Z}_2^n$ is a function of the message $W_i$ and the interference vector $\mathbf{s}_i \in \mathbb{Z}_2^n$, and must satisfy the input constraint
$\frac{1}{n} w_H(\mathbf{x}_i) \le q_i$, $i = 1, 2$, where $0 \le q_1, q_2 \le 1/2$ and $w_H(\cdot)$ is the Hamming weight. The coding rates $R_1$ and $R_2$
of the two users are given as usual by $R_i = \frac{1}{n} \log_2 |\mathcal{W}_i|$, where $\mathcal{W}_i$ is the set of messages of user $i$, and $n$ is the
length of the codeword.

The double-state MAC (1) generalizes the point-to-point channel with side information (SI) at the transmitter
considered by Gel'fand and Pinsker [2]. They prove their direct coding theorem using the framework of random
binning, which is widely used in the analysis of multi-terminal source and channel coding problems [3]. They
obtain a general capacity expression which involves an auxiliary random variable $U$:

$$C = \max_{P(u, x \mid s)} \{ H(U \mid S) - H(U \mid Y) \}, \tag{3}$$

where the maximization is over all the joint distributions of the form $p(u, s, y, x) = p(s) p(u, x \mid s) p(y \mid x, s)$.

The channel in (1) with only one informed encoder (i.e., where $\mathcal{S}_2 = \{0\}$) was considered recently by
Somekh-Baruch et al. [4] and Kotagiri and Laneman [5]. The common message ($W_1 = W_2$) capacity of this channel is
known [4], and it involves using random binning by the informed user. For the binary "one dirty user" case (i.e.,
(2) with $S_2 = 0$), we show that Somekh-Baruch's common-message capacity becomes (see Appendix I)

$$C_{com} = H_b(q_1), \tag{4}$$

where $H_b(x) \triangleq -x \log_2(x) - (1 - x) \log_2(1 - x)$ is the binary entropy function. Clearly, the doubly-dirty
individual-message case is harder. Thus, it follows from (4) that the rate sum in the setting of Fig. 1 is upper bounded by

$$R_1 + R_2 \le \min\{ H_b(q_1), H_b(q_2) \}. \tag{5}$$

In Theorem 1 we show that this upper bound is in fact tight.
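
As a small numerical illustration of the bound (5), the following sketch evaluates the binary entropy function and the resulting rate-sum bound. The function names `binary_entropy` and `sum_rate_upper_bound` and the sample values of $q_1, q_2$ are illustrative assumptions, not part of the paper.

```python
import math


def binary_entropy(x: float) -> float:
    """H_b(x) = -x log2(x) - (1 - x) log2(1 - x), with H_b(0) = H_b(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def sum_rate_upper_bound(q1: float, q2: float) -> float:
    """Right-hand side of (5): min{H_b(q1), H_b(q2)} in bits per channel use."""
    return min(binary_entropy(q1), binary_entropy(q2))


if __name__ == "__main__":
    # Sample input constraints (hypothetical values chosen for illustration).
    for q1, q2 in [(0.1, 0.3), (0.3, 0.3), (0.5, 0.2)]:
        bound = sum_rate_upper_bound(q1, q2)
        print(f"q1={q1}, q2={q2}: R1 + R2 <= {bound:.4f} bit")
```

The bound is governed by the more tightly constrained user, which is why the helper scheme of Section II works with $q = \min\{q_1, q_2\}$.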

One approach to finding achievable rates for the doubly-dirty MAC is to extend the Gel'fand and Pinsker solution
[2] to the two-user / double-state case. As shown by Jafar [6], this extension leads to the following pentagonal
inner bound for the capacity region of (1):

$$\begin{aligned}
\mathcal{R}(U_1, U_2) \triangleq \Big\{ (R_1, R_2) :\; & R_1 \le I(U_1; Y \mid U_2) - I(U_1; S_1), \\
& R_2 \le I(U_2; Y \mid U_1) - I(U_2; S_2), \\
& R_1 + R_2 \le I(U_1, U_2; Y) - I(U_1; S_1) - I(U_2; S_2) \Big\}
\end{aligned} \tag{6}$$

for some $P(U_1, U_2, X_1, X_2 \mid S_1, S_2) = P(U_1, X_1 \mid S_1) P(U_2, X_2 \mid S_2)$. In fact, by a standard time-sharing argument
[3], the closure of the convex hull of the set of all rate pairs $(R_1, R_2)$ satisfying (6),

$$\mathcal{R}_{BSL} \triangleq \mathrm{cl\,conv} \Big\{ (R_1, R_2) \in \mathcal{R}(U_1, U_2) : P(U_1, U_2, X_1, X_2 \mid S_1, S_2) = P(U_1, X_1 \mid S_1) P(U_2, X_2 \mid S_2) \Big\}, \tag{7}$$

is also achievable¹. To the best of our knowledge, the set $\mathcal{R}_{BSL}$ is the best currently known single-letter
characterization for the rate region of the MAC with side information at the transmitters (1), and in particular, for the
doubly-dirty MAC (2)². The achievability of (7) can be proved, as usual, by an i.i.d. random binning scheme [6].

A different method to cancel known interference is by "linear strategies", i.e., binning based on the cosets of a
linear code [8], [9], [10]. In the sequel, we show that the outer bound (5) can indeed be achieved by a linear coding
scheme. Hence, the set of rate pairs $(R_1, R_2)$ satisfying (5) is the capacity region of the binary doubly-dirty MAC.
In contrast, we show that the single-letter region (7) is strictly contained in this capacity region. Hence, a random
binning scheme based on this extension of the Gel'fand-Pinsker solution [2] is not optimal for this problem.

A similar observation has been made by Körner and Marton [11] for the "two help one" source coding problem.
For a specific binary version known as the "modulo-two sum" problem, they showed that the minimum possible
rate sum is achieved by a linear coding scheme, while the best known single-letter expression for this problem is
strictly higher. See the discussion in [11, Section IV] and at the end of Section III.

¹ As in the Gel'fand and Pinsker solution, for a finite alphabet system it is enough to optimize over auxiliary variables $U_1$ and $U_2$ whose
alphabet size is bounded in terms of the size of the input and state alphabets.

² For the case where the users also have a common message $W_0$ to be transmitted jointly by both encoders, (7) can be improved by
adding another auxiliary random variable $U_0$ which plays the role of the common auxiliary r.v. in Marton's inner bound for the non-degraded
broadcast channel [7]. In this case, the joint distribution of $(U_0, U_1, U_2)$ is given by $P(U_0, U_1, U_2) = P(U_0) P(U_1 \mid U_0) P(U_2 \mid U_0)$, i.e., $U_1$
and $U_2$ are conditionally independent given $U_0$.

Although the "single-letter characterization" is a fundamental concept in information theory, it has not been
generally defined [12, p. 35]. Csiszár and Körner [13, p. 259] suggested defining it through the notion of computability,
i.e., a problem has a single-letter solution if there exists an algorithm which can decide if a point belongs to an
$\epsilon$-neighborhood of the achievable rate region with polynomial complexity in $1/\epsilon$. Since we are not aware of any
other computable solution to our problem, we shall refer to (7) as the "best known single-letter characterization".

An extension of these observations to continuous channels would be of interest. Costa [1] considered the
single-user case of the dirty channel problem $Y = X + S + Z$, where the interference $S$ and the noise $Z$ are assumed to
be i.i.d. Gaussian with variances $Q$ and $N$, respectively, and the input $X$ is subject to a power constraint $P$. He
showed that in this case, the transmitter side-information capacity (3) coincides with the zero-interference capacity
$\frac{1}{2} \log_2(1 + SNR)$, where $SNR = P/N$. Selecting the auxiliary random variable $U$ in (3) such that

$$U = X + \alpha S, \tag{8}$$

where $X$ and $S$ are independent, and taking $\alpha = \frac{P}{P+N}$, the formula (3) and its associated random binning scheme
are capacity achieving. The continuous (Gaussian) version of the doubly-dirty MAC of Fig. 1 was considered in
[10]. It was shown that by using a linear structure, i.e., lattice strategies [8], the full capacity region is achieved
in the limit of high SNR and high lattice dimension. In contrast, it was shown that for $Q \to \infty$ no positive rate is
achievable by using the natural generalization of Costa's strategy (8) to the two-user case, while a (scalar) modulo
addition version of (8) loses $\approx 0.254$ bit in the sum capacity. We shall further elaborate on this issue in Section IV.
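
The single-user statement above can be checked numerically. The sketch below evaluates $I(U;Y) - I(U;S) = h(U \mid S) - h(U \mid Y)$ for jointly Gaussian variables with $U = X + \alpha S$, using the standard conditional-variance formula for Gaussians. The function name `costa_rate` and the sample values of $P$, $Q$, $N$ are assumptions made for illustration only.

```python
import math


def costa_rate(P: float, Q: float, N: float, alpha: float) -> float:
    """I(U;Y) - I(U;S) in bits for U = X + alpha*S, Y = X + S + Z (all Gaussian).

    Uses h(U|S) = h(X) and Var(U|Y) = Var(U) - Cov(U,Y)^2 / Var(Y).
    """
    var_u = P + alpha ** 2 * Q
    cov_uy = P + alpha * Q
    var_y = P + Q + N
    var_u_given_y = var_u - cov_uy ** 2 / var_y
    return 0.5 * math.log2(P / var_u_given_y)


if __name__ == "__main__":
    P, Q, N = 1.0, 100.0, 0.1          # hypothetical powers
    alpha_costa = P / (P + N)          # Costa's choice of alpha
    print("Costa alpha      :", costa_rate(P, Q, N, alpha_costa))
    print("alpha = 1        :", costa_rate(P, Q, N, 1.0))
    print("no interference  :", 0.5 * math.log2(1.0 + P / N))
```

With $\alpha = P/(P+N)$ the computed rate matches $\frac{1}{2}\log_2(1 + P/N)$ for any $Q$, while other values of $\alpha$ fall below it.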

Similar observations regarding the advantage of modulo-lattice modulation with respect to a separation based 
solution were made by Nazer and Gastpar [14], in the context of computation over linear Gaussian networks, and 
also by Krithivasan and Pradhan [15] for multi-terminal rate distortion problems. 

The paper is organized as follows. In Section II the capacity region for the binary doubly-dirty MAC (2) is
derived, and linear coding is shown to be optimal. Section III develops a closed form expression for the best
known single-letter characterization (7) for this channel, and demonstrates that it is strictly contained in the
true capacity region. In Section IV we consider the Gaussian doubly-dirty MAC, and state a conjecture regarding
the capacity loss of single-letter characterization in this case.

II. The Capacity Region of the Binary Doubly-Dirty MAC 
The following theorem characterizes the capacity region of the binary doubly-dirty MAC of Fig. 1.

Theorem 1. The capacity region of the binary doubly-dirty MAC (2) is the set of all rate pairs $(R_1, R_2)$ satisfying

$$\mathcal{C}(q_1, q_2) \triangleq \Big\{ (R_1, R_2) : R_1 + R_2 \le \min\{ H_b(q_1), H_b(q_2) \} \Big\}. \tag{9}$$

Proof: The converse part: As explained in the Introduction (5), one way to derive an upper bound for the
rate sum is through the general one-dirty-user capacity formula [4], which we derive explicitly for the binary case
in Appendix I. Here we show directly the converse part, which is similar to the proof of the outer bound for the
Gaussian case in [16], [10]. We assume that user 1 and user 2 intend to transmit a common message $W$. An upper
bound on the rate of this message clearly upper bounds the sum rate $R_1 + R_2$ in the individual messages case.
Thus,

\begin{align}
n(R_1 + R_2) &\le H(W) \notag \\
&= H(W \mid Y^n) + I(W; Y^n) \notag \\
&\le I(W; Y^n) + n\epsilon_n \tag{10} \\
&= H(Y^n) - H(Y^n \mid W) + n\epsilon_n \notag \\
&= H(Y^n) - H(Y^n \mid W, S_1^n, S_2^n) - I(S_1^n, S_2^n; Y^n \mid W) + n\epsilon_n \notag \\
&= H(Y^n) - I(S_1^n, S_2^n; Y^n \mid W) + n\epsilon_n \tag{11} \\
&= H(Y^n) - H(S_1^n, S_2^n \mid W) + H(S_1^n, S_2^n \mid W, Y^n) + n\epsilon_n \notag \\
&\le -n + H(S_1^n \mid W, Y^n) + H(S_2^n \mid W, Y^n, S_1^n) + n\epsilon_n \tag{12} \\
&\le H(X_1^n \oplus X_2^n \oplus S_1^n \mid W, Y^n, S_1^n) + n\epsilon_n \tag{13} \\
&= H(X_2^n \mid W, Y^n, S_1^n) + n\epsilon_n \tag{14} \\
&\le n H_b(q_2) + n\epsilon_n, \tag{15}
\end{align}

where (10) follows from Fano's inequality, where $\epsilon_n \to 0$ as the error probability goes to zero for $n \to \infty$;
(11) follows since $Y^n$ is fully known given $W$, $S_1^n$ and $S_2^n$; (12) follows from the chain rule for entropy, and due
to $H(Y^n) \le n$ and $H(S_1^n, S_2^n \mid W) = H(S_1^n) + H(S_2^n) = 2n$ since $W$, $S_1^n$ and $S_2^n$ are mutually independent; (13)
follows since $H(S_1^n \mid W, Y^n) \le n$ and $Y^n = X_1^n \oplus X_2^n \oplus S_1^n \oplus S_2^n$; (14) follows since $X_1^n$ is a function of $(W, S_1^n)$;
finally (15) follows since $H(X_2^n \mid W, Y^n, S_1^n) \le H(X_2^n) \le n H_b(q_2)$.

In the same way we can show that $R_1 + R_2 \le H_b(q_1) + \epsilon_n$. The converse part follows since $\epsilon_n \to 0$ as
$P_e^{(n)} \to 0$ for $n \to \infty$.

The direct part is based on the scheme for the point-to-point binary dirty paper channel [9]. We define $q \triangleq \min\{q_1, q_2\}$.
In view of the converse part, it is sufficient to show achievability of the point $(R_1, R_2) = (H_b(q), 0)$,
since the outer bound may be achieved by time sharing with the symmetric point $(R_1, R_2) = (0, H_b(q))$. The corner
point $(R_1, R_2) = (H_b(q), 0)$ corresponds to the "helper problem", i.e., user 2 tries to help user 1 to transmit at its
highest rate. The encoders and decoder are described using a binary linear code $C(n, k)$ with parity check matrix
$H$. Let $\mathbf{v} \in \mathbb{Z}_2^{n-k}$ be a syndrome of the code $C$, where we note that each syndrome represents a different coset of
the linear code $C$. Let $f(\mathbf{v})$ denote the "leader" of (or the minimum weight vector in) the coset associated with
the syndrome $\mathbf{v}$ [17, Chap. 6], hence $f : \{0, 1\}^{n-k} \to \{0, 1\}^n$. For $\mathbf{a} \in \mathbb{Z}_2^n$, we define the $n$-dimensional modulo
operation over the code $C$ as

$$\mathbf{a} \bmod C \triangleq f(H\mathbf{a}),$$

which is the leader of the coset to which the vector $\mathbf{a}$ belongs.

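• Encoder of user 1: Let the transmitted message $\mathbf{v}_1 \in \mathbb{Z}_2^{n-k}$ be a syndrome in $C$, and let $\tilde{\mathbf{x}}_1 = f(\mathbf{v}_1)$ be
its coset leader. In particular $\mathbf{v}_1 = H\tilde{\mathbf{x}}_1$. Transmit the modulo of the code $C$ with respect to the difference
between $\tilde{\mathbf{x}}_1$ and $\mathbf{s}_1$, i.e.,

$$\mathbf{x}_1 = (\tilde{\mathbf{x}}_1 \oplus \mathbf{s}_1) \bmod C = f(\mathbf{v}_1 \oplus H\mathbf{s}_1).$$

• Encoder of user 2 (functions as a "helper" for user 1): Transmit

$$\mathbf{x}_2 = \mathbf{s}_2 \bmod C = f(H\mathbf{s}_2).$$

• Decoder:

1. Reconstruct $\tilde{\mathbf{x}}_1$ by $\hat{\mathbf{x}}_1 = \mathbf{y} \bmod C$.

2. Reconstruct the transmitted coset of user 1 by $\hat{\mathbf{v}}_1 = H\hat{\mathbf{x}}_1$.

In fact, the transmitted coset can be reconstructed directly as $\hat{\mathbf{v}}_1 = H\hat{\mathbf{x}}_1 = H(\mathbf{y} \bmod C) = H\mathbf{y}$, where the
last equality follows since $\mathbf{y} \bmod C$ and $\mathbf{y}$ are in the same coset.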
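It follows that the decoder correctly decodes the message coset $\mathbf{v}_1$, since

$$\begin{aligned}
\hat{\mathbf{v}}_1 &= H \cdot \big( \mathbf{y} \bmod C \big) \\
&= H \cdot \big( [\tilde{\mathbf{x}}_1 \oplus \mathbf{s}_1 \oplus \mathbf{s}_2 \oplus \mathbf{s}_1 \oplus \mathbf{s}_2] \bmod C \big) \\
&= H\tilde{\mathbf{x}}_1 \\
&= \mathbf{v}_1,
\end{aligned}$$

where the third equality follows since $\tilde{\mathbf{x}}_1$ and $\tilde{\mathbf{x}}_1 \bmod C$ are in the same coset. It is left to relate the coding rate
$R_1 = \frac{1}{n} \log_2 \big( |\{0, 1\}^{n-k}| \big) = 1 - k/n$ to the input constraint $q$. From [18], there exists a binary linear code with
covering radius $\rho$ that satisfies $\frac{k}{n} \le 1 - H_b(\rho/n) + \epsilon$, where $\epsilon \to 0$ as $n \to \infty$. The achievability of the point
$(H_b(q), 0)$ follows by using $q = \rho/n$, thus $R_1 = 1 - k/n \ge H_b(q) - \epsilon$, while $w_H(\mathbf{x}_1) = w_H(f(\mathbf{v}_1 \oplus H\mathbf{s}_1)) \le \rho$
and $w_H(\mathbf{x}_2) = w_H(f(H\mathbf{s}_2)) \le \rho$, hence

$$\frac{1}{n} E\, w_H\{\mathbf{x}_1\} = \frac{1}{n} E\, w_H\{ f(\mathbf{v}_1 \oplus H\mathbf{s}_1) \} \le q$$

$$\frac{1}{n} E\, w_H\{\mathbf{x}_2\} = \frac{1}{n} E\, w_H\{ f(H\mathbf{s}_2) \} \le q.$$

This completes the proof of the direct part of the theorem. □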
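
The helper scheme in the proof can be exercised on a toy code. The sketch below uses the (7, 4) Hamming code, whose covering radius is 1, purely as an illustrative assumption; the proof itself relies on long codes with covering radius close to $nq$ [18]. All names (`syndrome`, `mod_code`, `encode_user1`, `encode_helper`, `decode`, `run_trial`) are hypothetical.

```python
import itertools
import random

N = 7  # block length of the (7, 4) Hamming code (illustrative choice)
R = 3  # number of parity checks, n - k

# Parity-check matrix H: column j (1-based) is the binary expansion of j.
H = [[(j >> b) & 1 for j in range(1, N + 1)] for b in range(R)]


def syndrome(a):
    """Return H a over Z_2 as a tuple of R bits."""
    return tuple(sum(h * x for h, x in zip(row, a)) % 2 for row in H)


def xor(a, b):
    """Componentwise mod-2 addition of two equal-length bit tuples."""
    return tuple(x ^ y for x, y in zip(a, b))


# Coset-leader map f: syndrome -> a minimum-weight vector with that syndrome.
LEADER = {}
for a in sorted(itertools.product((0, 1), repeat=N), key=sum):
    LEADER.setdefault(syndrome(a), a)


def mod_code(a):
    """a mod C = f(H a): the leader of the coset containing a."""
    return LEADER[syndrome(a)]


def encode_user1(v1, s1):
    """x_1 = (x~_1 xor s_1) mod C, where x~_1 = f(v_1)."""
    return mod_code(xor(LEADER[v1], s1))


def encode_helper(s2):
    """x_2 = s_2 mod C."""
    return mod_code(s2)


def decode(y):
    """v^_1 = H (y mod C), which equals H y."""
    return syndrome(mod_code(y))


def run_trial(rng):
    """Simulate one block; return (sent, decoded, w_H(x1), w_H(x2))."""
    v1 = tuple(rng.randint(0, 1) for _ in range(R))
    s1 = tuple(rng.randint(0, 1) for _ in range(N))
    s2 = tuple(rng.randint(0, 1) for _ in range(N))
    x1 = encode_user1(v1, s1)
    x2 = encode_helper(s2)
    y = xor(xor(x1, x2), xor(s1, s2))  # channel (2) with Z = 0
    return v1, decode(y), sum(x1), sum(x2)


if __name__ == "__main__":
    rng = random.Random(0)
    trials = [run_trial(rng) for _ in range(1000)]
    print("all decoded:", all(v == v_hat for v, v_hat, _, _ in trials))
    print("max input weight:", max(max(w1, w2) for _, _, w1, w2 in trials))
```

Every transmitted vector is a coset leader, so its Hamming weight never exceeds the covering radius of the code, while the decoder recovers the syndrome exactly because the two interference syndromes cancel at the output.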

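As stated above, the achievability for the capacity region follows by time sharing the corner points $(H_b(q), 0)$ and
$(0, H_b(q))$, where $q = \min\{q_1, q_2\}$. It is also interesting to see how to achieve the rate sum $H_b(q)$ for an arbitrary
rate pair $(R_1, R_2)$ without time sharing. For that, let the message of user 1 be $\mathbf{m}_1 \in \mathbb{Z}_2^{l_1}$ and the message of user
2 be $\mathbf{m}_2 \in \mathbb{Z}_2^{l_2}$, where $l_1 + l_2 = n - k$. We define the following syndromes in $C$:

$$\mathbf{v}_1 \triangleq [\, \mathbf{m}_1 \;\; \underbrace{0 \cdots 0}_{l_2} \,] \in \mathbb{Z}_2^{n-k}$$

$$\mathbf{v}_2 \triangleq [\, \underbrace{0 \cdots 0}_{l_1} \;\; \mathbf{m}_2 \,] \in \mathbb{Z}_2^{n-k}$$

$$\mathbf{v} \triangleq \mathbf{v}_1 \oplus \mathbf{v}_2.$$

Clearly, given the syndrome $\mathbf{v}$, the syndromes $\mathbf{v}_1$ and $\mathbf{v}_2$ are fully known, and so are the messages $\mathbf{m}_1$ and $\mathbf{m}_2$.
Let $\tilde{\mathbf{x}}_i = f(\mathbf{v}_i)$ be the coset leader of $\mathbf{v}_i$ for $i = 1, 2$. In this case the transmission scheme is as follows:

• Encoder of user 1: transmit $\mathbf{x}_1 = (\tilde{\mathbf{x}}_1 \oplus \mathbf{s}_1) \bmod C = f(\mathbf{v}_1 \oplus H\mathbf{s}_1)$.

• Encoder of user 2: transmit $\mathbf{x}_2 = (\tilde{\mathbf{x}}_2 \oplus \mathbf{s}_2) \bmod C = f(\mathbf{v}_2 \oplus H\mathbf{s}_2)$.

• Decoder: reconstruct $\hat{\mathbf{v}} = H \cdot (\mathbf{y} \bmod C)$.

Therefore, we have that

$$\hat{\mathbf{v}} = H \cdot (\mathbf{y} \bmod C) = H \cdot (\tilde{\mathbf{x}}_1 \oplus \tilde{\mathbf{x}}_2) = \mathbf{v}_1 \oplus \mathbf{v}_2 = \mathbf{v}.$$

The sum capacity is achieved since $R_1 + R_2 = \frac{l_1 + l_2}{n} = \frac{n - k}{n} \ge H_b(q) - \epsilon$, where $\epsilon \to 0$ as $n \to \infty$, and the scheme satisfies
the input constraints.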
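
The syndrome-splitting scheme above can be sketched in the same way. As before, the (7, 4) Hamming code and the split $l_1 = 2$, $l_2 = 1$ are illustrative assumptions, and the helper names (`transmit`, `check_all_messages`) are hypothetical.

```python
import itertools

N, R = 7, 3      # (7, 4) Hamming code: n = 7, n - k = 3 (illustrative)
L1, L2 = 2, 1    # message lengths of users 1 and 2, with L1 + L2 = n - k

# Parity-check matrix H: column j (1-based) is the binary expansion of j.
H = [[(j >> b) & 1 for j in range(1, N + 1)] for b in range(R)]


def syndrome(a):
    """Return H a over Z_2 as a tuple of R bits."""
    return tuple(sum(h * x for h, x in zip(row, a)) % 2 for row in H)


def xor(a, b):
    """Componentwise mod-2 addition of two equal-length bit tuples."""
    return tuple(x ^ y for x, y in zip(a, b))


# Coset-leader map f: syndrome -> a minimum-weight vector with that syndrome.
LEADER = {}
for a in sorted(itertools.product((0, 1), repeat=N), key=sum):
    LEADER.setdefault(syndrome(a), a)


def transmit(m1, m2, s1, s2):
    """Run both encoders, the noiseless channel (2), and the decoder.

    Returns (decoded m1, decoded m2, w_H(x1), w_H(x2)).
    """
    v1 = tuple(m1) + (0,) * L2            # v_1 = [m_1 0...0]
    v2 = (0,) * L1 + tuple(m2)            # v_2 = [0...0 m_2]
    x1 = LEADER[xor(v1, syndrome(s1))]    # f(v_1 xor H s_1)
    x2 = LEADER[xor(v2, syndrome(s2))]    # f(v_2 xor H s_2)
    y = xor(xor(x1, x2), xor(s1, s2))
    v_hat = syndrome(y)                   # H (y mod C) = H y
    return v_hat[:L1], v_hat[L1:], sum(x1), sum(x2)


def check_all_messages(interference_pairs):
    """Check decoding and input weights for every message pair."""
    for m1 in itertools.product((0, 1), repeat=L1):
        for m2 in itertools.product((0, 1), repeat=L2):
            for s1, s2 in interference_pairs:
                m1_hat, m2_hat, w1, w2 = transmit(m1, m2, s1, s2)
                if (m1_hat, m2_hat) != (m1, m2) or max(w1, w2) > 1:
                    return False
    return True


if __name__ == "__main__":
    vectors = list(itertools.product((0, 1), repeat=N))
    pairs = list(zip(vectors, reversed(vectors)))
    print("scheme consistent:", check_all_messages(pairs))
```

Both users share the same code, and the rate split is set only by how the $n - k$ syndrome bits are divided between them, which is why no time sharing is needed.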

III. A Single-Letter Characterization for the Capacity Region 

In this section we characterize the best known single-letter region (7) for the binary doubly-dirty MAC (2),
and show that it is strictly contained in the capacity region (9). For simplicity, we shall assume identical input
constraints, i.e., $q_1 = q_2 = q$.

Definition 1. For a given $q$, the best known single-letter rate region for the binary doubly-dirty MAC (2), denoted
by $\mathcal{R}_{BSL}(q)$, is the set of all rate pairs $(R_1, R_2)$ satisfying (7) with the additional constraints that $EX_1, EX_2 \le q$.

In the following theorem we give a closed form expression for $\mathcal{R}_{BSL}(q)$.

Theorem 2. The best known single-letter rate region for the binary doubly-dirty MAC (2) is a triangular region
given by

$$\mathcal{R}_{BSL}(q) = \Big\{ (R_1, R_2) : R_1 + R_2 \le \mathrm{u.c.e} \big\{ [2H_b(q) - 1]^+ \big\} \Big\}, \tag{16}$$

where u.c.e is the upper convex envelope with respect to $q$, and $[x]^+ = \max\{0, x\}$.

Fig. 2 shows the sum capacity of the binary doubly-dirty MAC (9) versus the best known single-letter rate sum
(16) for equal input constraints. The latter is strictly contained in the capacity region which is achieved by a linear
code. The quantity $[2H_b(q) - 1]^+$ is not a convex-$\cap$ function with respect to $q$. The upper convex envelope of
$[2H_b(q) - 1]^+$ is achieved by time-sharing between the points $q = 0$ and $q = q^* \triangleq 1 - 1/\sqrt{2}$, therefore it is given
by

$$\mathrm{u.c.e} \big\{ [2H_b(q) - 1]^+ \big\} = \begin{cases} 2H_b(q) - 1, & q^* \le q \le 1/2 \\ C^* q, & 0 \le q < q^* \end{cases} \tag{17}$$

where $C^* \triangleq \frac{2H_b(q^*) - 1}{q^*}$.
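
A short numerical sketch of (17) follows. It evaluates $q^*$ and $C^*$, checks that the line $C^* q$ is tangent to $2H_b(q) - 1$ at $q^*$ (the derivative of $2H_b(q) - 1$ is $2\log_2\frac{1-q}{q}$), and compares the envelope with the sum capacity $H_b(q)$. The function and constant names are illustrative assumptions.

```python
import math


def binary_entropy(x: float) -> float:
    """H_b(x) in bits, with H_b(0) = H_b(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


Q_STAR = 1.0 - 1.0 / math.sqrt(2.0)                       # q* in (17)
C_STAR = (2.0 * binary_entropy(Q_STAR) - 1.0) / Q_STAR    # slope C* in (17)


def single_letter_rate_sum(q: float) -> float:
    """Upper convex envelope of [2 H_b(q) - 1]^+ for 0 <= q <= 1/2, as in (17)."""
    if q < Q_STAR:
        return C_STAR * q
    return 2.0 * binary_entropy(q) - 1.0


def sum_capacity(q: float) -> float:
    """Sum capacity H_b(q) of Theorem 1 for q1 = q2 = q."""
    return binary_entropy(q)


if __name__ == "__main__":
    slope_at_q_star = 2.0 * math.log2((1.0 - Q_STAR) / Q_STAR)
    print(f"q* = {Q_STAR:.4f}, C* = {C_STAR:.4f}, tangent slope = {slope_at_q_star:.4f}")
    for q in (0.05, 0.1, 0.2, 0.3, 0.4, 0.5):
        print(f"q={q:.2f}: single-letter {single_letter_rate_sum(q):.4f}, "
              f"capacity {sum_capacity(q):.4f}")
```

The two curves meet only at $q = 1/2$ (and trivially at $q = 0$); for every other $q$ the envelope lies strictly below $H_b(q)$.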

Proof: The direct part is shown by choosing in (7) $U_1 = S_1 \oplus X_1$ and $U_2 = S_2 \oplus X_2$, where $X_1, X_2 \sim$
Bernoulli($q$) and $X_1, X_2, S_1, S_2$ are independent. From (6) the achievable rate sum is given by

\begin{align}
R_1 + R_2 &= I(U_1, U_2; Y) - I(U_1, U_2; S_1, S_2) \notag \\
&= H(U_1 \mid S_1) + H(U_2 \mid S_2) - H(U_1, U_2 \mid U_1 \oplus U_2) \tag{18} \\
&= H(U_1 \mid S_1) + H(U_2 \mid S_2) - H(U_1 \mid U_1 \oplus U_2) - H(U_2 \mid U_1 \oplus U_2, U_1) \tag{19} \\
&= H(X_1) + H(X_2) - H(U_1 \mid U_1 \oplus U_2) \tag{20} \\
&= 2H_b(q) - 1, \tag{21}
\end{align}

where (18) follows since $Y = U_1 \oplus U_2$; (19) follows from the chain rule for entropy; (20) follows since $U_2$ is fully
known given $U_1 \oplus U_2, U_1$, thus $H(U_2 \mid U_1 \oplus U_2, U_1) = 0$; (21) follows since $H(X_i) = H_b(q)$ and since $U_1, U_2$ are
independent with $P(U_i = 1) = 1/2$, thus $H(U_1 \mid U_1 \oplus U_2) = H(U_1) = 1$.

The converse part of the proof is given in Appendix II. □

We see that the binary doubly-dirty MAC is a memoryless channel coding problem, where the capacity region
is achievable by a linear code, while the best known single-letter rate region is strictly contained in the capacity
region. This may be explained by the fact that each user has only partial side information, and distributed random
binning is unable to capture the linear structure of the channel.
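
The rate sum $2H_b(q) - 1$ obtained in the direct part of Theorem 2 can be reproduced by brute-force enumeration of the joint distribution for the choice $U_i = S_i \oplus X_i$. The sketch below is only a numerical check under that choice; the helper names (`entropy`, `marginal`, `mutual_information`, `single_letter_rate_sum`) are assumptions.

```python
import itertools
import math
from collections import defaultdict


def entropy(pmf):
    """Shannon entropy in bits of a dict mapping outcomes to probabilities."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0.0)


def marginal(joint, idx):
    """Marginal pmf of the coordinates listed in idx."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[tuple(outcome[i] for i in idx)] += p
    return out


def mutual_information(joint, a, b):
    """I(A;B) = H(A) + H(B) - H(A,B) for coordinate index tuples a and b."""
    return (entropy(marginal(joint, a)) + entropy(marginal(joint, b))
            - entropy(marginal(joint, a + b)))


def single_letter_rate_sum(q):
    """I(U1,U2;Y) - I(U1;S1) - I(U2;S2) for U_i = S_i xor X_i, X_i ~ Bernoulli(q)."""
    joint = defaultdict(float)  # outcomes are tuples (s1, s2, u1, u2, y)
    for s1, s2, x1, x2 in itertools.product((0, 1), repeat=4):
        p = 0.25 * (q if x1 else 1.0 - q) * (q if x2 else 1.0 - q)
        u1, u2 = s1 ^ x1, s2 ^ x2
        y = x1 ^ x2 ^ s1 ^ s2
        joint[(s1, s2, u1, u2, y)] += p
    S1, S2, U1, U2, Y = (0,), (1,), (2,), (3,), (4,)
    return (mutual_information(joint, U1 + U2, Y)
            - mutual_information(joint, U1, S1)
            - mutual_information(joint, U2, S2))


if __name__ == "__main__":
    for q in (0.2, 0.3, 0.4, 0.5):
        print(f"q={q}: rate sum = {single_letter_rate_sum(q):.4f} bit")
```

For small $q$ the value is negative, which is why (16) takes the positive part and then the upper convex envelope.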

In order to understand the limitation of random binning versus a linear code, we consider these two schemes
for high enough $q$, that is $2H_b(q) - 1 > 0$. The random binning scheme uses $U_i = X_i \oplus S_i$ where $X_i \sim$
Bernoulli($q$) and $S_i \sim$ Bernoulli(1/2) are independent, therefore $Y = U_1 \oplus U_2$ where $U_i \sim$ Bernoulli(1/2) for
$i = 1, 2$. Each transmitter maps the message (bin) $W_i$ into a codeword $\mathbf{u}_i$ which is with high probability at a
Hamming distance of $nq$ from $\mathbf{s}_i$. Therefore, given the vectors $(\mathbf{s}_1, \mathbf{s}_2)$, the available input space is approximately
$2^{nH(U_1, U_2 \mid S_1, S_2)} = 2^{nH(X_1, X_2)} = 2^{2nH_b(q)}$. Given the received vector $\mathbf{y}$, the residual ambiguity is given by
$2^{nH(U_1, U_2 \mid Y)} = 2^{n[H(U_1 \mid Y) + H(U_2 \mid Y, U_1)]} = 2^n$ since $H(U_1 \mid Y) = 1$ and $H(U_2 \mid Y, U_1) = 0$. As a result, the achievable
rate sum is given by

$$R_1 + R_2 = \frac{1}{n} \log_2 \left( \frac{|\text{input space}|}{|\text{residual ambiguity space}|} \right) = 2H_b(q) - 1.$$

The linear coding scheme shown in Theorem 1 has the same input space size as the random binning scheme, i.e.,
$2^{2nH_b(q)}$, since each user has $2^{nH_b(q)}$ cosets. However, given the received vector $\mathbf{y}$ there are $2^{nH_b(q)}$ possible pairs
of cosets, i.e., the residual ambiguity is only $2^{nH_b(q)}$. Therefore, the linear code achieves a rate sum of $R_1 + R_2 \approx
2H_b(q) - H_b(q) = H_b(q)$. The advantage of the linear coding scheme results from the "ordered structure" of the
linear code, which decreases the residual ambiguity from 1 bit in random coding to $H_b(q)$.

The following example illustrates the above arguments for the case that user 2 is a "helper" for user 1, i.e.,
$R_2 = 0$, and user 1 transmits at his highest rate for each technique (random binning or linear coding). Table I
summarizes the rates and codebook sizes for each user for $q = 0.3$, that is $H_b(q) \approx 0.88$ bit.

| | Random binning | Linear code |
|---|---|---|
| Rate sum | $2H_b(q) - 1 = 0.76$ bit | $H_b(q) = 0.88$ bit |
| Codewords per bin/coset | $2^{nI(U_1; S_1)} = 2^{n[1 - H_b(q)]} = 2^{0.12n}$ | $2^{n[1 - H_b(q)]} = 2^{0.12n}$ |
| Helper (user 2) - codebook size | $2^{nI(U_2; S_2)} = 2^{n[1 - H_b(q)]} = 2^{0.12n}$ | $2^{n[1 - H_b(q)]} = 2^{0.12n}$ |
| User 1 - codebook size | $2^{0.76n} \cdot 2^{0.12n} = 2^{0.88n}$ | $2^{0.12n} \cdot 2^{0.88n} = 2^{n}$ |
| Number of possible codeword pairs | $2^{0.88n} \cdot 2^{0.12n} = 2^{n}$ | $2^{n} \cdot 2^{0.12n} = 2^{1.12n}$ |

TABLE I

Random binning and linear coding schemes: codebook sizes for the helper problem with $q = 0.3$.
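
The exponents in Table I follow directly from $H_b(0.3)$. The sketch below recomputes them as exponents per channel use, so that a value $e$ stands for a size of $2^{en}$; the dictionary keys and the function name `helper_problem_exponents` are illustrative assumptions.

```python
import math


def binary_entropy(x: float) -> float:
    """H_b(x) in bits, with H_b(0) = H_b(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def helper_problem_exponents(q: float):
    """Return (random_binning, linear_code) exponent dictionaries for Table I."""
    hb = binary_entropy(q)
    per_bin = 1.0 - hb  # I(U_i; S_i) = 1 - H_b(q)
    random_binning = {
        "rate_sum": 2.0 * hb - 1.0,
        "codewords_per_bin": per_bin,
        "helper_codebook": per_bin,
        "user1_codebook": (2.0 * hb - 1.0) + per_bin,
        "codeword_pairs": hb + per_bin,
    }
    linear_code = {
        "rate_sum": hb,
        "codewords_per_bin": per_bin,
        "helper_codebook": per_bin,
        "user1_codebook": per_bin + hb,
        "codeword_pairs": 1.0 + per_bin,
    }
    return random_binning, linear_code


if __name__ == "__main__":
    rb, lc = helper_problem_exponents(0.3)
    for key in rb:
        print(f"{key:20s} random binning: {rb[key]:.2f}   linear code: {lc[key]:.2f}")
```

The linear code has more codeword pairs in total ($2^{1.12n}$ versus $2^{n}$), yet its coset structure leaves fewer of them confusable at the decoder.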



Körner and Marton [11] observed a similar behavior for the "two help one" source coding problem shown in
Fig. 3. In this problem, there are three binary sources $X, Y, Z$, where $Z = X \oplus Y$, and the joint distribution of $X$
and $Y$ is symmetric with $P(X \ne Y) = \theta$. The goal is to encode the sources $X$ and $Y$ separately such that $Z$ can
be reconstructed losslessly. Körner and Marton showed that the rate sum required is at least

$$R_x + R_y \ge 2H(Z), \tag{22}$$

and furthermore, this rate sum can be achieved by a linear code: each encoder transmits the syndrome of the
observed source relative to a good linear binary code for a BSC with crossover probability $\theta$.

Fig. 3. The Körner-Marton configuration.

In contrast, the "one help one" problem [19], [20] has a closed single-letter expression for the rate region, which
corresponds to a random binning coding scheme. Körner and Marton [11] generalize the expression of [19], [20]
to the "two help one" problem, and show that the minimal rate sum required using this expression is given by

$$R_x + R_y \ge H(X, Y). \tag{23}$$

The region (23) corresponds to Slepian-Wolf encoding of $X$ and $Y$, and it can also be derived from the Berger-Tung
achievable region [21] for distributed coding for $X$ and $Y$ with one reconstruction $Z$ under the distortion measure
$d(X, Y, Z) = X \oplus Y \oplus Z$. Clearly, the region (23) is strictly contained in the Körner-Marton region $R_x + R_y \ge 2H(Z)$
(22) (since $H(X, Y) = 1 + H(Z) > 2H(Z)$ for $Z \sim$ Bernoulli($\theta$), where $\theta \ne \frac{1}{2}$). For further background on related
source coding problems, see [15].
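
The gap between (22) and (23) is easy to tabulate. The sketch below compares the Körner-Marton rate sum $2H_b(\theta)$ with the Slepian-Wolf rate sum $1 + H_b(\theta)$ for a symmetric source pair; the function names are illustrative assumptions.

```python
import math


def binary_entropy(x: float) -> float:
    """H_b(x) in bits, with H_b(0) = H_b(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def korner_marton_rate_sum(theta: float) -> float:
    """Minimum rate sum 2 H(Z) of (22), with Z ~ Bernoulli(theta)."""
    return 2.0 * binary_entropy(theta)


def slepian_wolf_rate_sum(theta: float) -> float:
    """Rate sum H(X, Y) = 1 + H(Z) of (23) for the symmetric binary source."""
    return 1.0 + binary_entropy(theta)


if __name__ == "__main__":
    for theta in (0.05, 0.11, 0.25, 0.5):
        km = korner_marton_rate_sum(theta)
        sw = slepian_wolf_rate_sum(theta)
        print(f"theta={theta}: linear {km:.4f} bit, random binning {sw:.4f} bit, "
              f"saving {sw - km:.4f} bit")
```

The saving equals $1 - H_b(\theta)$, which vanishes only at $\theta = 1/2$, mirroring the $1 - H_b(q)$ gap between $H_b(q)$ and $2H_b(q) - 1$ in the channel problem.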

IV. The Gaussian Doubly-Dirty MAC 

In this section we introduce our conjecture regarding the rate loss of the best known single-letter characterization
for the capacity region of the two-user Gaussian doubly-dirty MAC at high SNR. The Gaussian doubly-dirty MAC
[10] is given by

$$Y = X_1 + X_2 + S_1 + S_2 + Z, \tag{24}$$

where $Z \sim \mathcal{N}(0, N)$ is independent of $X_1, X_2, S_1, S_2$, and where user 1 and user 2 must satisfy the power
constraints $\frac{1}{n} \sum_{i=1}^{n} X_{1i}^2 \le P_1$ and $\frac{1}{n} \sum_{i=1}^{n} X_{2i}^2 \le P_2$; see Fig. 4. The interference signals $S_1$ and $S_2$ are known
non-causally to the transmitters of user 1 and user 2, respectively. We shall assume that $S_1$ and $S_2$ are independent
Gaussian with variances going to infinity, i.e., $S_i \sim \mathcal{N}(0, Q_i)$ where $Q_i \to \infty$ for $i = 1, 2$. The signal to noise
ratios for the two users are $SNR_1 = \frac{P_1}{N}$ and $SNR_2 = \frac{P_2}{N}$.

The capacity region at high SNR, i.e., $SNR_1, SNR_2 \gg 1$, is given by [10],

$$R_1 + R_2 \le \frac{1}{2} \log_2 \left( \frac{\min\{P_1, P_2\}}{N} \right), \tag{25}$$

and it is achievable by a modulo lattice coding scheme of dimension going to infinity. In contrast, it was shown in
[10] that at high SNR and strong independent Gaussian interferences, the natural generalization of Costa's strategy
(8) for the two-user case, i.e., with auxiliary random variables $U_1 = X_1 + S_1$ and $U_2 = X_2 + S_2$, is not able to
achieve any positive rate. A better choice for $U_1$ and $U_2$ suggested in [10] is a modulo version of Costa's strategy
(8),

$$U_i^* = [X_i + S_i] \bmod \Delta_i, \tag{26}$$

where $\Delta_i = \sqrt{12 P_i}$, and where $X_i \sim \mathrm{Unif}\big( [-\frac{\Delta_i}{2}, \frac{\Delta_i}{2}) \big)$ is independent of $S_i$, for $i = 1, 2$. In this case the rate loss
with respect to (25) is $\frac{1}{2} \log_2 \left( \frac{\pi e}{6} \right) \approx 0.254$ bit.

The best known single-letter capacity region for the Gaussian doubly-dirty MAC (24) is defined as the set of
all rate pairs $(R_1, R_2)$ satisfying (7), where $X_1$ and $X_2$ are restricted to the power constraints $EX_1^2 \le P_1$ and
$EX_2^2 \le P_2$. We believe that for high SNR and strong interference, the modulo-$\Delta$ strategy (26) is an optimum
choice for $(X_1, X_2, U_1, U_2)$ in (7) for the Gaussian doubly-dirty MAC. This implies the following conjecture about
the rate loss of the best known single-letter characterization.

Conjecture 1. For the Gaussian doubly-dirty MAC, at high SNR and strong interference, the best known single-letter
expression $R_{BSL}^{sum}$ (7) loses

$$C_{sum} - R_{BSL}^{sum} = \frac{1}{2} \log_2 \left( \frac{\pi e}{6} \right) \approx 0.254 \text{ bit}, \tag{27}$$

with respect to the sum capacity $C_{sum}$ (25).

Note that the right hand side of (27) is the well known "shaping loss" [22] (equivalent to a 1.53 dB power loss).
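
The two figures quoted for the shaping loss can be reproduced from the differential entropies of a Gaussian and a uniform random variable with equal power. The sketch below is a numerical check only; the function names and the sample powers are assumptions.

```python
import math


def shaping_loss_bits() -> float:
    """(1/2) log2(pi e / 6): the right-hand side of (27), in bits."""
    return 0.5 * math.log2(math.pi * math.e / 6.0)


def shaping_loss_db() -> float:
    """The same loss expressed as a power ratio in dB: 10 log10(pi e / 6)."""
    return 10.0 * math.log10(math.pi * math.e / 6.0)


def entropy_gap_bits(power: float) -> float:
    """h(Gaussian) - h(uniform) at equal variance `power`, in bits.

    A uniform variable on an interval of length Delta = sqrt(12 * power)
    has variance `power` and differential entropy log2(Delta).
    """
    h_gauss = 0.5 * math.log2(2.0 * math.pi * math.e * power)
    h_unif = math.log2(math.sqrt(12.0 * power))
    return h_gauss - h_unif


if __name__ == "__main__":
    print(f"shaping loss: {shaping_loss_bits():.4f} bit = {shaping_loss_db():.3f} dB")
    for power in (0.5, 1.0, 10.0):
        print(f"P={power}: entropy gap = {entropy_gap_bits(power):.4f} bit")
```

The gap does not depend on the power, which is consistent with the loss in (27) being a constant at high SNR.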

A heuristic approach to attack the proof of this conjecture is to follow the steps of the proof of the converse
part in the binary case (Theorem 2). First, in Lemma 6 we derive a simplified single-letter formula, $G_{max}(P_1, P_2)$,
which is analogous to Lemma 1 in the binary case. The next step would be to optimize this expression. However,
an optimal choice for the auxiliary random variables $V_1, V_1', V_2, V_2'$ (provided in the binary case by Lemma 2 and
Lemma 3) is unfortunately still missing for the Gaussian case. The expression in Lemma 6 is close in spirit to the
point-to-point dirty tape capacity for high SNR and strong interference [8]. In [8] it is shown that optimizing the
capacity is equivalent to minimum entropy-constrained scalar quantization in high resolution, which is achieved
by a lattice quantizer. Clearly, if we could show a similar lemma for the two variable pairs in the maximization
of Lemma 6, i.e., that it is achieved by a pair of lattice quantizers, then the conjecture would be an immediate
consequence.

It should be noted that the above discussion is valid only for strong interferences $S_1$ and $S_2$. For interference
with finite power, it seems that cancelling the interference part of the time and staying silent the rest of the time
(like in the time-sharing region $0 \le q < q^*$ in the binary case) may achieve better rates.

V. Summary

A memoryless information theoretic problem is considered open as long as we are missing a general single-letter 
characterization for its information performance. This goes hand in hand with the optimality of the random coding 
approach for those problems which are currently solved. We examined this traditional view for the memoryless 
doubly-dirty MAC. 

In the binary case, we showed that the best known single-letter characterization is strictly contained in the region
achievable by linear coding, and that the latter is in fact the full capacity region of the problem. In the Gaussian
case, we conjectured that the best known single-letter characterization suffers an inherent rate loss (equal to the
well known "shaping loss" $\frac{1}{2} \log_2(\pi e/6)$), and we provided a partial proof. This is in contrast to the asymptotic
optimality (dimension $\to \infty$) of lattice strategies, as recently shown in [10].

The underlying reason for these performance gaps is that random binning is in general not optimal when side
information is distributed among more than one terminal in the network. In the specific case of the doubly-dirty
MAC (like in Körner-Marton's modulo-two sum problem [11] and similar settings [14], [15]), the linear structure
of the network allows one to show that linear binning is not only better, but also capacity achieving.

Appendix I 

A Closed Form Expression for the Capacity of the Binary MAC with One Dirty User 
We consider the binary dirty MAC (2) with $S_2 = 0$,

$$Y = X_1 \oplus X_2 \oplus S_1, \tag{28}$$

where $S_1 \sim$ Bernoulli(1/2) is known non-causally at the encoder of user 1 with the input constraints $\frac{1}{n} w_H(\mathbf{x}_i) \le q_i$
for $i = 1, 2$. We show that the common message ($W_1 = W_2 = W$) capacity of this channel is given by

$$C_{com} = H_b(q_1). \tag{29}$$

To prove (29), consider the general expression for the common message capacity of the MAC with one informed
user [4], given by

$$C_{com} = \max_{U_1, X_1, X_2} \{ I(U_1, X_2; Y) - I(U_1, X_2; S_1) \}, \tag{30}$$

where the maximization is over all the joint distributions

$$P(S_1, X_1, X_2, U_1, Y) = P(S_1) P(X_2) P(U_1 \mid X_2, S_1) P(X_1 \mid S_1, U_1) P(Y \mid X_1, X_2, S_1).$$

The converse part of (29) follows since for any $U_1, X_1, X_2$, the common message rate $R_{com}$ can be upper bounded
by

\begin{align}
R_{com} &= I(U_1, X_2; Y) - I(U_1, X_2; S_1) \notag \\
&= H(S_1 \mid U_1, X_2) - H(Y \mid U_1, X_2) + H(Y) - H(S_1) \notag \\
&\le H(S_1 \mid U_1, X_2) - H(Y \mid U_1, X_2) \tag{31} \\
&= H(S_1 \mid U_1, X_2) - H(X_1 \oplus S_1 \mid U_1, X_2) \tag{32} \\
&= H(S_1 \mid T) - H(X_1 \oplus S_1 \mid T) \tag{33} \\
&= E_T \big[ H(S_1 \mid T = t) - H(X_1 \oplus S_1 \mid T = t) \big] \tag{34} \\
&= E_T \{ H_b(\alpha_t) - H_b(\beta_t) \}, \tag{35}
\end{align}

where (31) follows since $H(Y) \le 1$ and $H(S_1) = 1$; (32) follows since $Y = X_1 \oplus X_2 \oplus S_1$; (33) follows from
the definition $T \triangleq (U_1, X_2)$; (34) follows from the definition of the conditional entropy; (35) follows from the
following definitions: $\alpha_t \triangleq P(S_1 = 1 \mid T = t)$ and $\beta_t \triangleq P(S_1 \oplus X_1 = 1 \mid T = t)$ for any $t \in \mathcal{T}$. We also define
$q_{1|t} \triangleq P(X_1 = 1 \mid T = t) = E\{X_1 \mid T = t\}$, therefore the input constraint of user 1 can be written as

$$EX_1 = E_T E\{X_1 \mid T = t\} = E_T\{q_{1|t}\} \le q_1. \tag{36}$$

Without loss of generality, we can only consider $\alpha_t, \beta_t, q_{1|t} \in [0, 1/2]$ in (33) for any $t \in \mathcal{T}$. Thus,

\begin{align}
R_{com} &\le E_T \big\{ H_b(\alpha_t) - H_b\big( [\alpha_t - q_{1|t}]^+ \big) \big\} \tag{37} \\
&\le E_T \{ H_b(q_{1|t}) \} \tag{38} \\
&\le H_b\big( E_T\{q_{1|t}\} \big) \tag{39} \\
&\le H_b(q_1), \tag{40}
\end{align}

where (37) follows from (35) and since $H_b(\beta_t) \ge H_b\big( [\alpha_t - q_{1|t}]^+ \big)$, where $[x]^+ = \max\{x, 0\}$; (38) follows since
$H_b(\alpha_t) - H_b\big( [\alpha_t - q_{1|t}]^+ \big)$ is increasing in $\alpha_t$ for $\alpha_t \le q_{1|t} \le 1/2$ and decreasing in $\alpha_t$ for $q_{1|t} \le \alpha_t \le 1/2$, thus
the maximum is for $\alpha_t = q_{1|t}$; (39) follows from Jensen's inequality since $H_b(\cdot)$ is convex-$\cap$; (40) follows from the
input constraint for user 1 (36). The converse part follows since the outer bound is valid for any $U_1$ and $X_1, X_2$
that satisfy the input constraints.
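
Step (38) of the converse rests on the claim that $H_b(\alpha) - H_b([\alpha - q]^+)$ is maximized over $\alpha \in [0, 1/2]$ at $\alpha = q$, where it equals $H_b(q)$. The sketch below checks this on a grid; the function names `gap` and `maximize_gap` and the grid resolution are assumptions.

```python
import math


def binary_entropy(x: float) -> float:
    """H_b(x) in bits, with H_b(0) = H_b(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def gap(alpha: float, q: float) -> float:
    """H_b(alpha) - H_b([alpha - q]^+), the integrand bounded in (37)-(38)."""
    return binary_entropy(alpha) - binary_entropy(max(alpha - q, 0.0))


def maximize_gap(q: float, steps: int = 1000):
    """Grid search over alpha in [0, 1/2]; returns (best alpha, best value)."""
    grid = [0.5 * i / steps for i in range(steps + 1)]
    best_alpha = max(grid, key=lambda a: gap(a, q))
    return best_alpha, gap(best_alpha, q)


if __name__ == "__main__":
    for q in (0.1, 0.25, 0.3, 0.45):
        alpha, value = maximize_gap(q)
        print(f"q={q}: argmax alpha = {alpha:.4f}, max = {value:.4f}, "
              f"H_b(q) = {binary_entropy(q):.4f}")
```

Below $\alpha = q$ the expression reduces to $H_b(\alpha)$, which is increasing; above it, the concavity of $H_b$ makes the difference decreasing, so the grid maximum sits at $\alpha = q$.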

The direct part is shown by using $U_1 = X_1 \oplus S_1$ where $X_1$ and $S_1$ are independent with $X_1 \sim$ Bernoulli($q_1$),
thus $U_1 \sim$ Bernoulli(1/2). Furthermore, $X_2 \sim$ Bernoulli($q_2$), which is independent of $X_1, U_1, S_1$. In this case
$Y = U_1 \oplus X_2$, hence $Y \sim$ Bernoulli(1/2). Using this choice for $U_1, X_1, X_2$, the achievable common message rate
is given by

\begin{align}
R_{com} &= I(U_1, X_2; Y) - I(U_1, X_2; S_1) \notag \\
&= H(S_1 \mid U_1, X_2) - H(Y \mid U_1, X_2) + H(Y) - H(S_1) \notag \\
&= H(X_1) \tag{41} \\
&= H_b(q_1), \notag
\end{align}

where (41) follows since $H(S_1 \mid U_1, X_2) = H(S_1 \mid U_1) = H(X_1)$, $H(Y \mid U_1, X_2) = 0$, $H(Y) = 1$ and $H(S_1) = 1$.

Appendix II 
Proof of the Converse Part of Theorem 2

The proof of the converse part follows from Lemma 1, Lemma 2 and Lemma 3, whereas Lemma 5 and Lemma 4
are technical results which assist in the derivation of Lemma 3.
Let us define the following functions:

$$F(P_{V_1, V_1'}, P_{V_2, V_2'}) \triangleq \big[ H(V_1) + H(V_2) - H(V_1' \oplus V_2') - 1 \big]^+, \tag{42}$$

where $[x]^+ = \max(0, x)$; its $(q_1, q_2)$-constrained maximization with respect to $V_1, V_1', V_2, V_2' \in \mathbb{Z}_2$, where $(V_1, V_1')$
and $(V_2, V_2')$ are independent, i.e.,

$$F_{max}(q_1, q_2) \triangleq \max_{(V_1, V_1') \perp (V_2, V_2')} F(P_{V_1, V_1'}, P_{V_2, V_2'}) \quad \text{s.t. } P(V_i \ne V_i') \le q_i, \text{ for } i = 1, 2; \tag{43}$$

and the upper convex envelope of $F_{max}(q_1, q_2)$ with respect to $q_1, q_2$,

$$\bar{F}_{max}(q_1, q_2) \triangleq \mathrm{u.c.e} \big\{ F_{max}(q_1, q_2) \big\}. \tag{44}$$

In the following lemma we give an outer bound for the best known single-letter region of the binary doubly-dirty MAC in the spirit of [23, Lemma 3] and [8, Proposition 1].

Lemma 1. The best known single-letter rate sum (7) of the binary doubly-dirty MAC with input constraints $q_1$ and $q_2$ is upper bounded by

$$R_1 + R_2 \le \overline{F}_{\max}(q_1, q_2). \tag{45}$$
Proof: An outer bound on the best known single-letter region (7) is given by

$$\begin{aligned}
R_{BSL}(U_1, U_2) &\triangleq [I(U_1, U_2; Y) - I(U_1, U_2; S_1, S_2)]^+ && (46)\\
&= [H(S_1|U_1) + H(S_2|U_2) - H(Y|U_1, U_2) + H(Y) - H(S_1) - H(S_2)]^+ && (47)\\
&\le [H(S_1|U_1) + H(S_2|U_2) - H(Y|U_1, U_2) - 1]^+ && (48)\\
&= [E_{U_1,U_2}\{H(S_1|U_1 = u_1) + H(S_2|U_2 = u_2) - H(Y|U_1 = u_1, U_2 = u_2) - 1\}]^+ && (49)\\
&\le E_{U_1,U_2}\{[H(S_1|U_1 = u_1) + H(S_2|U_2 = u_2) - H(Y|U_1 = u_1, U_2 = u_2) - 1]^+\} && (50)\\
&\le E_{U_1,U_2}\{F(P_{S_1, S_1 \oplus X_1 | U_1 = u_1}, P_{S_2, S_2 \oplus X_2 | U_2 = u_2})\} && (51)\\
&\le E_{U_1,U_2}\{\overline{F}_{\max}(q_{1|u_1}, q_{2|u_2})\} && (52)\\
&\le \overline{F}_{\max}(E_{U_1} q_{1|U_1}, E_{U_2} q_{2|U_2}) && (53)\\
&\le \overline{F}_{\max}(q_1, q_2), && (54)
\end{aligned}$$
where (48) follows since $H(S_1) = H(S_2) = 1$ and $H(Y) \le 1$; (49) follows from the definition of the conditional entropy; (50) follows since $[E x]^+ \le E\{[x]^+\}$; (51) follows from the definition of the function $F(P_{V_1,V_1'}, P_{V_2,V_2'})$ (42); likewise (52) follows from the definition of the function $\overline{F}_{\max}(q_1, q_2)$ (44), and from the definition

$$q_{i|u_i} \triangleq P(S_i \ne X_i \oplus S_i \mid U_i = u_i) = P(X_i = 1 \mid U_i = u_i), \text{ for } i = 1, 2;$$

(53) follows from Jensen's inequality since $\overline{F}_{\max}(q_1, q_2)$ is a concave function; (54) follows from the input constraints where

$$E X_i = E_{U_i} P(X_i = 1 \mid U_i = u_i) = \sum_{u_i \in \mathcal{U}_i} P(u_i) P(X_i = 1 \mid U_i = u_i) = \sum_{u_i \in \mathcal{U}_i} P(u_i)\, q_{i|u_i} \le q_i, \text{ for } i = 1, 2. \tag{55}$$

The lemma now follows since the upper bound (54) for the rate sum is independent of $U_1$ and $U_2$, hence it also bounds the single-letter region $\mathcal{R}_{BSL}(q_1, q_2)$. □

A simplified expression for the function $F_{\max}(q_1, q_2)$ of (43) is shown in the following lemma.



Lemma 2. The function $F_{\max}(q_1, q_2)$ (43) is given by

$$F_{\max}(q_1, q_2) = \max_{\alpha_1, \alpha_2 \in [0, 1/2]} \left[H_b(\alpha_1) + H_b(\alpha_2) - H_b([\alpha_1 - q_1]^+ * [\alpha_2 - q_2]^+) - 1\right]^+, \tag{56}$$

where $*$ is the binary convolution, i.e., $x * y \triangleq (1 - x)y + (1 - y)x$.

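Proof: The function $F_{\max}(q_1, q_2)$ is defined in (42) and (43), where $V_1, V_1', V_2, V_2'$ are binary random variables. Let us define the following probabilities:

$$\begin{aligned}
\alpha_i &\triangleq P(V_i = 1)\\
\delta_i &\triangleq P(V_i' = 1 \mid V_i = 0)\\
\gamma_i &\triangleq P(V_i' = 0 \mid V_i = 1),
\end{aligned}$$

for $i = 1, 2$. We thus have

$$\begin{aligned}
P(V_i' = 1) &= (1 - \alpha_i)\delta_i + \alpha_i(1 - \gamma_i) \triangleq g(\alpha_i, \delta_i, \gamma_i)\\
P(V_i \ne V_i') &= \alpha_i\gamma_i + (1 - \alpha_i)\delta_i \triangleq h(\alpha_i, \delta_i, \gamma_i),
\end{aligned}$$

for $i = 1, 2$. The maximization (43) can be written as

$$F_{\max}(q_1, q_2) = \max_{\alpha_1, \alpha_2} \left[H_b(\alpha_1) + H_b(\alpha_2) - \min_{\substack{\delta_i, \gamma_i:\ h(\alpha_i, \delta_i, \gamma_i) \le q_i,\\ i = 1, 2}} H_b\big(g(\alpha_1, \delta_1, \gamma_1) * g(\alpha_2, \delta_2, \gamma_2)\big) - 1\right]^+. \tag{57}$$

This maximization has two equivalent solutions $(\alpha_1^o, \alpha_2^o)$ and $(1 - \alpha_1^o, 1 - \alpha_2^o)$, where $0 \le \alpha_1^o, \alpha_2^o \le 0.5$, since any other $(\alpha_1, \alpha_2)$ can only increase the inner minimization in (57), which results in a lower $F_{\max}(q_1, q_2)$. Therefore, without loss of generality we may assume that $0 \le \alpha_1, \alpha_2 \le 0.5$.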
To prove the lemma we need to show that for any $\alpha_i$ the inner minimization is achieved by

$$\delta_i = 0, \quad \gamma_i = \min\{1, q_i/\alpha_i\}, \quad i = 1, 2.$$

In other words, $V_i'$ has the smallest possible probability of being 1 under the constraint that $P(V_i \ne V_i') \le q_i$, implying that the transition from $V_i$ to $V_i'$ is a "Z channel". The inner minimization requires that $P(V_i' = 1)$ be minimized subject to the constraint $P(V_i \ne V_i') \le q_i$; therefore it is equivalent to the following minimization:

$$\min_{h(\alpha_i, \delta_i, \gamma_i) \le q_i} g(\alpha_i, \delta_i, \gamma_i), \quad i = 1, 2.$$

For $\alpha_i \le q_i$ the solution is $\delta_i = 0$ and $\gamma_i = 1$, since in this case $g(\alpha_i, \delta_i, \gamma_i) = 0$ and the constraint is satisfied. For $q_i < \alpha_i \le 0.5$, in order to minimize $g(\alpha_i, \delta_i, \gamma_i)$, it is required that $\delta_i \in [0, q_i/(1 - \alpha_i)]$ be minimal and $\gamma_i \in [0, q_i/\alpha_i]$ be maximal such that the constraint is satisfied. Clearly, the best choice is $\delta_i = 0$ and $\gamma_i = q_i/\alpha_i$; in this case the constraint is satisfied and $g(\alpha_i, \delta_i, \gamma_i) = \alpha_i - q_i$. □
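The inner minimization in the proof of Lemma 2 can be sanity-checked by exhaustive search. The sketch below is illustrative only: it assumes the parametrization $\delta = P(V' = 1 \mid V = 0)$, $\gamma = P(V' = 0 \mid V = 1)$, searches a finite grid (so it only approximates the minimum when $q/\alpha$ is off the grid), and the function names are hypothetical.

```python
def g(alpha, delta, gamma):
    """P(V' = 1) for P(V = 1) = alpha and transition parameters delta, gamma."""
    return (1 - alpha) * delta + alpha * (1 - gamma)


def h(alpha, delta, gamma):
    """P(V != V'), the quantity constrained to be at most q."""
    return alpha * gamma + (1 - alpha) * delta


def min_g_grid(alpha, q, steps=200):
    """Grid search for min g subject to h <= q over (delta, gamma) in [0,1]^2."""
    best = 1.0
    for i in range(steps + 1):
        delta = i / steps
        for j in range(steps + 1):
            gamma = j / steps
            # small slack guards against floating-point round-off in h
            if h(alpha, delta, gamma) <= q + 1e-12:
                best = min(best, g(alpha, delta, gamma))
    return best


def min_g_closed_form(alpha, q):
    """Z-channel solution: delta = 0, gamma = min(1, q/alpha)."""
    return max(alpha - q, 0.0)
```

For example, `min_g_grid(0.4, 0.1)` and `min_g_closed_form(0.4, 0.1)` should both be close to 0.3, which is the $[\alpha_i - q_i]^+$ term appearing in (56).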
The next lemma gives an explicit upper bound for $F_{\max}(q_1, q_2)$ (43) for the case that $q_1 = q_2$. Let

$$f(x) \triangleq x - \frac{1}{1 + \left(\frac{1}{x} - 1\right)^2}, \tag{58}$$

and let

$$q_c \triangleq \max_{x \in [0, 1/2]} f(x). \tag{59}$$

Since $f(x)$ is differentiable, we can characterize $q_c$ by differentiating $f(x)$ with respect to $x$ and equating to zero, thus we get that

$$4x^4 - 8x^3 + 10x^2 - 6x + 1 = 0.$$

This fourth-order polynomial has two complex roots and two real roots; one of the real roots is a local minimum of $f(x)$ and the other is a local maximum. Specifically, this local maximum maximizes $f(x)$ over the interval $x \in [0, 1/2]$, and it achieves $q_c \approx 0.1501$, which occurs at $x \approx 0.257$.
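The value of $q_c$ can be reproduced numerically. The following sketch assumes $f(x) = x - 1/(1 + (1/x - 1)^2)$ as in (58) and locates the stationary point in $(0, 1/2)$ by plain bisection on the quartic above; the helper names are hypothetical and the bisection is our choice of method, not the paper's.

```python
def f(x):
    """f(x) = x - 1 / (1 + (1/x - 1)^2), defined for x > 0."""
    return x - 1.0 / (1.0 + (1.0 / x - 1.0) ** 2)


def stationarity_poly(x):
    """Quartic whose roots are the stationary points of f."""
    return 4 * x**4 - 8 * x**3 + 10 * x**2 - 6 * x + 1


def bisect_root(func, lo, hi, iters=100):
    """Bisection; assumes func(lo) and func(hi) have opposite signs."""
    f_lo = func(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def critical_point():
    """Return (x_max, q_c): the maximizer of f on (0, 1/2) and its value."""
    x_max = bisect_root(stationarity_poly, 0.0, 0.5)
    return x_max, f(x_max)
```

The quartic is positive at $x = 0$ and negative at $x = 1/2$, so the bracket $[0, 1/2]$ contains the sign change that bisection needs.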

Lemma 3. For $q_1 = q_2 = q$, we have that:

$$\begin{aligned}
F_{\max}(q, q) &= 2H_b(q) - 1, && q_c \le q \le 1/2\\
F_{\max}(q, q) &\le C^* q, && 0 < q < q_c && (60)\\
F_{\max}(0, 0) &= 0, && q = 0,
\end{aligned}$$

where $q_c$ is defined in (59), while $C^* \triangleq \frac{2H_b(q^*) - 1}{q^*}$ and $q^* \triangleq 1 - 1/\sqrt{2} \approx 0.3$ are defined in (11).

Note that in the first case ($q_c \le q \le 1/2$) the maximum in (56) is achieved by $\alpha_1 = \alpha_2 = q$, while in the third case ($q = 0$) the maximum in (56) is achieved by $\alpha_1 = \alpha_2 = 1/2$, as shown in Fig. 5. Although we do not have an explicit expression for $F_{\max}(q, q)$ in the range $0 < q < q_c$, the bound $F_{\max}(q, q) \le C^* q$ is sufficient for the purpose of proving Theorem 2 because $q_c < q^*$. In Fig. 4 a numerical characterization of $F_{\max}(q, q)$ is plotted.
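A coarse grid search over $(\alpha_1, \alpha_2)$ gives a quick numerical feel for Lemma 3. This is a sketch under the assumption that $F_{\max}(q, q)$ is evaluated through the expression in (56); the grid resolution and the names `F`, `fmax_grid`, `Q_STAR`, `C_STAR` are illustrative choices, and a grid maximum can only underestimate the true maximum.

```python
import math


def hb(p):
    """Binary entropy in bits, with hb(0) = hb(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def conv(x, y):
    """Binary convolution x * y = (1 - x) y + (1 - y) x."""
    return (1 - x) * y + (1 - y) * x


def F(a1, a2, q):
    """Objective of (56) for q1 = q2 = q, before the outer [.]^+."""
    return hb(a1) + hb(a2) - hb(conv(max(a1 - q, 0.0), max(a2 - q, 0.0))) - 1.0


def fmax_grid(q, steps=100):
    """Grid approximation of max F over alpha1, alpha2 in [0, 1/2]."""
    grid = [0.5 * i / steps for i in range(steps + 1)]
    return max(F(a1, a2, q) for a1 in grid for a2 in grid)


Q_STAR = 1 - 1 / math.sqrt(2)
C_STAR = (2 * hb(Q_STAR) - 1) / Q_STAR
```

With these definitions, `fmax_grid(0.2)` should match `2 * hb(0.2) - 1` (first case of the lemma), while `fmax_grid(0.05)` should stay below `C_STAR * 0.05` (second case).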
Fig. 4. Numerical results of $F_{\max}(q, q)$ for $q \in [0, 0.12]$ (Fig. 2 is the same plot for $q \in [0, 0.5]$).

Proof: Define

$$F(\alpha_1, \alpha_2, q) \triangleq H_b(\alpha_1) + H_b(\alpha_2) - H_b([\alpha_1 - q]^+ * [\alpha_2 - q]^+) - 1.$$

From the discussion above about the cases of equality in (60), Lemma 3 will follow by showing that $F(\alpha_1, \alpha_2, q)$ is otherwise smaller, i.e.,

$$F(\alpha_1, \alpha_2, q) \le \begin{cases} C^* q, & 0 < q < q_c\\ 2H_b(q) - 1, & q_c \le q \le 1/2 \end{cases} \tag{61}$$

for all $0 \le \alpha_1, \alpha_2 \le 1/2$. It is easy to see that for $\alpha_1, \alpha_2 \le q$ the function $F(\alpha_1, \alpha_2, q)$ is monotonically increasing in $\alpha_1, \alpha_2$, and thus $F(\alpha_1, \alpha_2, q) \le F(q, q, q) = 2H_b(q) - 1$. For $\alpha_1 \le q$ and $q \le \alpha_2 \le 1/2$, $F(\alpha_1, \alpha_2, q)$ is increasing in $\alpha_1$ and decreasing in $\alpha_2$, and thus $F(\alpha_1, \alpha_2, q) \le F(q, q, q) = 2H_b(q) - 1$. Clearly, from symmetry, also for $\alpha_2 \le q$ and $q \le \alpha_1 \le 1/2$, $F(\alpha_1, \alpha_2, q) \le 2H_b(q) - 1$. As a consequence, we have to show that (61) is satisfied only for $q \le \alpha_1, \alpha_2 \le 1/2$. Likewise, in the sequel we may assume without loss of generality that $q \le \alpha_2 \le \alpha_1 \le 1/2$.

The bound for the interval $q_c \le q \le 1/2$: in this case (61) is equivalent to the following bound:

$$H_b((\alpha_1 - q) * (\alpha_2 - q)) - H_b(\alpha_1) - H_b(\alpha_2) + 2H_b(q) \ge 0, \quad \text{for } q_c \le q \le \alpha_2 \le \alpha_1 \le 1/2. \tag{62}$$

Fig. 5. The optimal $\alpha_1 = \alpha_2 = \alpha(q)$ which maximizes

The LHS is lower bounded by

$$\begin{aligned}
H_b((\alpha_1 - q) * (\alpha_2 - q)) &- H_b(\alpha_1) - H_b(\alpha_2) + 2H_b(q)\\
&\ge H_b(\alpha_1 - q) - H_b(\alpha_1) - H_b(\alpha_2) + 2H_b(q) && (63)\\
&\ge H_b(\alpha_1 - q) - 2H_b(\alpha_1) + 2H_b(q) && (64)\\
&\ge 0, && (65)
\end{aligned}$$

where (63) follows since $H_b((\alpha_1 - q) * (\alpha_2 - q)) \ge H_b(\alpha_1 - q)$; (64) follows since $\alpha_2 \le \alpha_1 \le 1/2$; (65) follows from Lemma 4 below.

The bound for the interval $0 < q < q_c$: in this case (61) is equivalent to the following bound:

$$H_b((\alpha_1 - q) * (\alpha_2 - q)) \ge H_b(\alpha_1) + H_b(\alpha_2) - 1 - C^* \cdot q, \quad \text{for } 0 < q \le \alpha_2 \le \alpha_1 \le q_c. \tag{66}$$

For fixed $\alpha_1$ and $\alpha_2$, let us denote the LHS and the RHS of (66) as

$$\begin{aligned}
g_l(q) &\triangleq H_b((\alpha_1 - q) * (\alpha_2 - q))\\
g_r(q) &\triangleq H_b(\alpha_1) + H_b(\alpha_2) - 1 - C^* \cdot q.
\end{aligned}$$

The function $g_l(q)$ is convex-$\cap$ in $q$, since it is a composition of the function $H_b(x)$, which is non-decreasing and convex-$\cap$ in the range $[0, 1/2]$, and the function $[\alpha_1 - q] * [\alpha_2 - q]$, which is convex-$\cap$ in $q$ [24]. Since $g_r(q)$ is a linear function of $q$ and $g_l(q)$ is a convex-$\cap$ function of $q$, the bound (66) is satisfied if the interval edges ($q = 0$ and $q = \alpha_2$) satisfy this bound. For $q = 0$, (66) holds since

$$\begin{aligned}
g_l(q = 0) &= H_b(\alpha_1 * \alpha_2)\\
&\ge \max\{H_b(\alpha_1), H_b(\alpha_2)\}\\
&\ge \min\{H_b(\alpha_1), H_b(\alpha_2)\}\\
&\ge H_b(\alpha_1) + H_b(\alpha_2) - 1\\
&= g_r(q = 0).
\end{aligned}$$

For $q = \alpha_2$, where $0 < q < q_c$, the bound (66) is satisfied since

$$\begin{aligned}
g_r(q = \alpha_2) &= H_b(\alpha_1) + H_b(\alpha_2) - 1 - C^* \cdot \alpha_2 && (67)\\
&\le H_b(\alpha_1) - H_b(q^*) + H_b(0.5q^*) - 0.5 && (68)\\
&\le H_b(\alpha_1) - H_b(q_c) && (69)\\
&\le H_b(\alpha_1) - H_b(\alpha_2) && (70)\\
&\le H_b(\alpha_1 - \alpha_2) && (71)\\
&= g_l(q = \alpha_2), && (72)
\end{aligned}$$

where (68) follows from Lemma 5 since $\arg\max_{\alpha_2 \in [0, 1/2]} f_2(\alpha_2) = 0.5q^*$, and since $C^* = \frac{2H_b(q^*) - 1}{q^*}$; (69) follows since for $q^* = 1 - 1/\sqrt{2}$ and $q_c$ defined in (59), we have $H_b(1 - 1/\sqrt{2}) - H_b(0.5(1 - 1/\sqrt{2})) + 0.5 \approx 0.77 > H_b(q_c)$; (70) follows since $q_c > \alpha_2$, thus $H_b(q_c) > H_b(\alpha_2)$; (71) follows since $H_b(\alpha_1) - H_b(\alpha_1 - \alpha_2)$ is decreasing in $\alpha_1$, thus $H_b(\alpha_1) - H_b(\alpha_1 - \alpha_2) \le H_b(\alpha_2)$ for $\alpha_2 \le \alpha_1 \le 1/2$. Therefore, the bound (66) follows, which completes the proof. □
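The numerical comparison behind (69) and the edge inequality $g_r(q = \alpha_2) \le g_l(q = \alpha_2)$ can be evaluated directly. The sketch below is only a sanity check: it takes $q_c \approx 0.1501$ from the text as a given constant, and the names `step_69_constant` and `edge_gap` are hypothetical.

```python
import math


def hb(p):
    """Binary entropy in bits, with hb(0) = hb(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


Q_STAR = 1 - 1 / math.sqrt(2)
C_STAR = (2 * hb(Q_STAR) - 1) / Q_STAR
Q_C = 0.1501  # approximate value of q_c quoted in the text


def step_69_constant():
    """H_b(q*) - H_b(0.5 q*) + 0.5, the quantity compared with H_b(q_c)."""
    return hb(Q_STAR) - hb(0.5 * Q_STAR) + 0.5


def edge_gap(a1, a2):
    """g_l(q = a2) - g_r(q = a2); should be non-negative for a2 < q_c <= ... a1 <= 1/2."""
    g_l = hb(a1 - a2)
    g_r = hb(a1) + hb(a2) - 1 - C_STAR * a2
    return g_l - g_r
```

Evaluating `step_69_constant()` gives roughly 0.77, which exceeds `hb(Q_C)` (roughly 0.61), so the inequality in (69) holds with a comfortable margin.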
Lemma 4 and Lemma 5 are auxiliary lemmas used in the proof of Lemma 3.

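Lemma 4. For $q_c \le q \le \alpha_1 \le 1/2$, the following inequality is satisfied:

$$f_1(\alpha_1) \triangleq H_b(\alpha_1 - q) - 2H_b(\alpha_1) + 2H_b(q) \ge 0. \tag{73}$$

Proof: Since $f_1(\alpha_1 = q) = 0$, it is sufficient to show that $f_1(\alpha_1)$ is a non-decreasing function of $\alpha_1$, i.e., $\frac{d}{d\alpha_1} f_1(\alpha_1) \ge 0$ for $q_c \le q \le \alpha_1 \le 1/2$. We therefore require

$$\frac{d}{d\alpha_1} f_1(\alpha_1) = \log_2\left(\frac{1}{\alpha_1 - q} - 1\right) - 2\log_2\left(\frac{1}{\alpha_1} - 1\right) \ge 0. \tag{74}$$

Due to the monotonicity of the log function, (74) is equivalent to

$$q \ge \alpha_1 - \frac{1}{1 + \left(\frac{1}{\alpha_1} - 1\right)^2} = f(\alpha_1), \tag{75}$$

where $f(\cdot)$ was defined in (58). Since by the definition of $q_c$ (59) $f(x) \le q_c$ $\forall x \in [0, 1/2]$, it follows that $f(\alpha_1) \le q$ $\forall \alpha_1$ if $q_c \le q$, and in particular for $q_c \le q \le \alpha_1$, which implies (75) as desired. □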
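Lemma 4 can be probed numerically by scanning $\alpha_1$ over $[q, 1/2]$. The following sketch assumes $f_1(\alpha_1) = H_b(\alpha_1 - q) - 2H_b(\alpha_1) + 2H_b(q)$ as in (73); the grid size and the name `min_f1` are illustrative choices.

```python
import math


def hb(p):
    """Binary entropy in bits, with hb(0) = hb(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def f1(a1, q):
    """f_1(alpha_1) = H_b(alpha_1 - q) - 2 H_b(alpha_1) + 2 H_b(q)."""
    return hb(a1 - q) - 2 * hb(a1) + 2 * hb(q)


def min_f1(q, steps=2000):
    """Minimum of f_1 over a uniform grid of alpha_1 in [q, 1/2]."""
    return min(f1(q + k * (0.5 - q) / steps, q) for k in range(steps + 1))
```

For a value above the threshold, such as $q = 0.16$, the grid minimum should be zero (attained at $\alpha_1 = q$). For a value well below $q_c$, such as $q = 0.05$, the minimum becomes negative, which illustrates why the lemma is restricted to $q \ge q_c$.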
Lemma 5. Let

$$f_2(x) \triangleq H_b(x) - 1 - C^* \cdot x, \tag{76}$$

where $x \in [0, 1/2]$, and $C^* \triangleq \frac{2H_b(q^*) - 1}{q^*}$, where $q^* \triangleq 1 - 1/\sqrt{2}$. The maximum of $f_2(x)$ is achieved by

$$\arg\max_x f_2(x) = 0.5q^* = \frac{1}{2}\left(1 - 1/\sqrt{2}\right). \tag{77}$$

Proof: By differentiating $f_2(x)$ with respect to $x$ and comparing to zero, we get that

$$0 = \frac{d}{dx} f_2(x) = \log_2\left(\frac{1 - x}{x}\right) - C^*, \tag{78}$$

thus $x^o = \frac{1}{2^{C^*} + 1}$ maximizes $f_2(x)$ since the second derivative is negative, i.e., $\frac{d^2}{dx^2} f_2(x)\big|_{x = x^o} < 0$. The lemma follows since $x^o = \frac{1}{2^{C^*} + 1} = 0.5q^*$. □
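The identity $x^o = 1/(2^{C^*} + 1) = 0.5q^*$ is easy to confirm in floating point. This sketch assumes $C^* = (2H_b(q^*) - 1)/q^*$ with $q^* = 1 - 1/\sqrt{2}$; the names `f2` and `stationary_point` are hypothetical.

```python
import math


def hb(p):
    """Binary entropy in bits, with hb(0) = hb(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


Q_STAR = 1 - 1 / math.sqrt(2)
C_STAR = (2 * hb(Q_STAR) - 1) / Q_STAR


def f2(x):
    """f_2(x) = H_b(x) - 1 - C* x."""
    return hb(x) - 1 - C_STAR * x


def stationary_point():
    """Solve log2((1 - x) / x) = C* for x."""
    return 1.0 / (2.0 ** C_STAR + 1.0)
```

Comparing `stationary_point()` with `0.5 * Q_STAR`, and checking that no grid point in $[0, 1/2]$ gives a larger `f2`, reproduces the claim of Lemma 5.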

We are now in a position to summarize the proof of Theorem 2.

Proof of Theorem 2 - Converse Part. The rate sum is upper bounded by

$$\begin{aligned}
R_1 + R_2 &\le \text{u.c.e}\{F_{\max}(q, q)\} && (79)\\
&\le \text{u.c.e}\left\{\begin{array}{ll} C^* \cdot q, & 0 \le q < q_c\\ 2H_b(q) - 1, & q_c \le q \le 1/2 \end{array}\right\} && (80)\\
&= \text{u.c.e}\{[2H_b(q) - 1]^+\}, && (81)
\end{aligned}$$

where (79) follows from Lemma 1, (80) follows from Lemma 3, and (81) follows since (80) is equal to the upper convex envelope of $[2H_b(q) - 1]^+$.
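The step to (81) rests on the line $C^* q$ being tangent to $2H_b(q) - 1$ at $q = q^*$. The sketch below checks this numerically; the closed form in `envelope` (the tangent line up to $q^*$, then $2H_b(q) - 1$) is our reading of the upper convex envelope under that tangency assumption, not a formula stated in this appendix, and the function names are hypothetical.

```python
import math


def hb(p):
    """Binary entropy in bits, with hb(0) = hb(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


Q_STAR = 1 - 1 / math.sqrt(2)
C_STAR = (2 * hb(Q_STAR) - 1) / Q_STAR


def tangent_slope():
    """Derivative of 2 H_b(q) - 1 at q = q*."""
    return 2 * math.log2((1 - Q_STAR) / Q_STAR)


def envelope(q):
    """Assumed closed form of u.c.e{[2 H_b(q) - 1]^+} on [0, 1/2]."""
    return C_STAR * q if q <= Q_STAR else 2 * hb(q) - 1


def clipped(q):
    """[2 H_b(q) - 1]^+."""
    return max(2 * hb(q) - 1, 0.0)
```

If `tangent_slope()` equals `C_STAR`, the chord from the origin to $(q^*, 2H_b(q^*) - 1)$ is tangent there, so `envelope` is concave and dominates `clipped` on the whole interval.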

Appendix III 

A Simplified Outer Bound for the Sum Capacity in the Strong Interference Gaussian Case

Lemma 6. The best known single-letter sum capacity (7) of the Gaussian doubly-dirty MAC (24) with power constraints $P_1, P_2$, and strong interferences ($Q_1, Q_2 \to \infty$) is upper bounded by

$$R_1 + R_2 \le \text{u.c.e}\left\{\sup_{V_1, V_1', V_2, V_2'} \left[h(V_1) + h(V_2) - h(V_1' + V_2' + Z) + h(S_1 + S_2) - h(S_1) - h(S_2)\right]^+\right\}, \tag{82}$$

where u.c.e is the upper convex envelope operation with respect to $P_1$ and $P_2$, and $[x]^+ = \max(0, x)$. The supremum is over all $V_1, V_1', V_2, V_2'$ such that $(V_1, V_1')$ is independent of $(V_2, V_2')$, and

$$E\{(V_i - V_i')^2\} \le P_i, \quad h(V_i) \le h(S_i),$$

for $i = 1, 2$.

Proof: Let us define the following functions (corresponding to $F(P_{V_1,V_1'}, P_{V_2,V_2'})$ of (42)):

$$G(f_{V_1,V_1'}, f_{V_2,V_2'}) \triangleq \left[h(V_1) + h(V_2) - h(V_1' + V_2' + Z) + h(S_1 + S_2) - h(S_1) - h(S_2)\right]^+. \tag{83}$$

The second function is the following maximization of (83) with respect to $V_1, V_1', V_2, V_2'$:

$$G_{\max}(P_1, P_2) \triangleq \sup_{V_1, V_1', V_2, V_2'} G(f_{V_1,V_1'}, f_{V_2,V_2'}) \quad \text{s.t. } E\{(V_i - V_i')^2\} \le P_i,\ h(V_i) \le h(S_i), \text{ for } i = 1, 2. \tag{84}$$

Finally, we define the upper convex envelope of $G_{\max}(P_1, P_2)$ with respect to $P_1$ and $P_2$:

$$\overline{G}_{\max}(P_1, P_2) \triangleq \text{u.c.e}\{G_{\max}(P_1, P_2)\}. \tag{85}$$
Clearly, if we take only the rate sum equation in (7), we get an outer bound on the best known single-letter region:

$$\begin{aligned}
R_{BSL}(U_1, U_2) &\triangleq [I(U_1, U_2; Y) - I(U_1, U_2; S_1, S_2)]^+ && (86)\\
&= [h(S_1|U_1) + h(S_2|U_2) - h(Y|U_1, U_2) + h(Y) - h(S_1) - h(S_2)]^+ && (87)\\
&\le [h(S_1|U_1) + h(S_2|U_2) - h(Y|U_1, U_2) + h(S_1 + S_2) - h(S_1) - h(S_2)]^+ + o(1) && (88)\\
&= \big[E_{U_1,U_2}\{h(S_1|U_1 = u_1) + h(S_2|U_2 = u_2) - h(Y|U_1 = u_1, U_2 = u_2)\}\\
&\qquad + h(S_1 + S_2) - h(S_1) - h(S_2)\big]^+ + o(1) && (89)\\
&\le E_{U_1,U_2}\big\{\big[h(S_1|U_1 = u_1) + h(S_2|U_2 = u_2) - h(X_1 + S_1 + X_2 + S_2 + Z \mid U_1 = u_1, U_2 = u_2)\\
&\qquad + h(S_1 + S_2) - h(S_1) - h(S_2)\big]^+\big\} + o(1) && (90)\\
&\le E_{U_1,U_2}\{G(f_{S_1, S_1 + X_1 | U_1 = u_1}, f_{S_2, S_2 + X_2 | U_2 = u_2})\} + o(1) && (91)\\
&\le E_{U_1,U_2}\{\overline{G}_{\max}(P_{1|u_1}, P_{2|u_2})\} + o(1) && (92)\\
&\le \overline{G}_{\max}(E_{U_1} P_{1|U_1}, E_{U_2} P_{2|U_2}) + o(1) && (93)\\
&\le \overline{G}_{\max}(P_1, P_2) + o(1), && (94)
\end{aligned}$$



where (88) follows since $h(Y) \le h(S_1 + S_2) + o(1)$, where $o(1) \to 0$ as $Q_1, Q_2 \to \infty$; (89) follows from the definition of the conditional entropy; (90) follows since $[E x]^+ \le E\{[x]^+\}$ and since $Y = X_1 + S_1 + X_2 + S_2 + Z$; (91) follows from the definition of the function $G(f_{V_1,V_1'}, f_{V_2,V_2'})$ (83); likewise (92) follows from the definition of the function $\overline{G}_{\max}(P_1, P_2)$ (85), and since $h(S_i|U_i) \le h(S_i)$, and from the definition

$$P_{i|u_i} \triangleq E\{X_i^2 \mid U_i = u_i\}, \text{ for } i = 1, 2;$$

(93) follows from Jensen's inequality since $\overline{G}_{\max}(P_1, P_2)$ is a concave function; (94) follows from the input constraints where

$$E X_i^2 = E_{U_i} E\{X_i^2 \mid U_i = u_i\} = E_{U_i} P_{i|U_i} \le P_i, \text{ for } i = 1, 2. \tag{95}$$

The lemma follows since the upper bound (94) for the rate sum is now independent of $U_1$ and $U_2$, hence it also bounds the single-letter region $\mathcal{R}_{BSL}(P_1, P_2)$. □






Acknowledgment 

The authors wish to thank Ashish Khisti for earlier discussions on the binary case. The authors also would like 
to thank Uri Erez for helpful comments. 

References 

[1] M. Costa, "Writing on dirty paper," IEEE Trans. Information Theory, vol. IT-29, pp. 439-441, May 1983. 

[2] S. Gelfand and M. S. Pinsker, "Coding for channel with random parameters," Problemy Pered. Inform. (Problems of Inform. Trans.), vol. 9, no. 1, pp. 19-31, 1980.
[3] T. M. Cover and J. A. Thomas, Elements of Information Theory. New York: Wiley, 1991. 

[4] A. Somekh-Baruch, S. Shamai, and S. Verdu, "Cooperative encoding with asymmetric state information at the transmitters," in 
Proceedings 44th Annual Allerton Conference on Communication, Control, and Computing, Univ. of Illinois, Urbana, IL, USA, Sep. 
2006. 

[5] S. Kotagiri and J. N. Laneman, "Multiple access channels with state information known at some encoders," IEEE Trans. Information 

Theory, July 2006, submitted for publication. 
[6] S. A. Jafar, "Capacity with causal and non-causal side information - a unified view," IEEE Trans. Information Theory, vol. IT-52, pp. 

5468-5475, Dec. 2006. 

[7] K. Marton, "A coding theorem for the discrete memoryless broadcast channel," IEEE Trans. Information Theory, vol. IT-22, pp. 
374-377, May 1979. 

[8] U. Erez, S. Shamai, and R. Zamir, "Capacity and lattice strategies for canceling known interference," IEEE Trans. Information Theory, 

vol. IT-51, pp. 3820-3833, Nov. 2005. 
[9] R. Zamir, S. Shamai, and U. Erez, "Nested linear/lattice codes for structured multiterminal binning," IEEE Trans. Information Theory, 
vol. IT-48, pp. 1250-1276, June 2002. 
[10] T. Philosof, A. Khisti, U. Erez, and R. Zamir, "Lattice strategies for the dirty multiple access channel," in Proceedings of IEEE 

International Symposium on Information Theory, Nice, France, June 2007. 
[11] J. Korner and K. Marton, "How to encode the modulo-two sum of binary sources," IEEE Trans. Information Theory, vol. IT-25, pp. 
219-221, March 1979. 

[12] T. M. Cover and B. Gopinath, Open Problems in Communication and Computation. New York: Springer-Verlag, 1987.
[13] I. Csiszar and J. Korner, Information Theory - Coding Theorems for Discrete Memoryless Systems. New York: Academic Press, 1981.
[14] B. Nazer and M. Gastpar, "Computation over multiple-access channels," IEEE Trans. Information Theory, vol. IT-53, pp. 3498-3516, 
Oct. 2007. 

[15] D. Krithivasan and S. S. Pradhan, "Lattices for distributed source coding: Jointly Gaussian sources and reconstruction of a linear function," arXiv:cs.IT/0707.3461v1.
[16] A. Khisti, "Private communication." 

[17] R. G. Gallager, Information Theory and Reliable Communication. New York, N.Y.: Wiley, 1968. 

[18] G. Cohen, I. Honkala, S. Litsyn, and A. Lobstein, Covering Codes. Amsterdam, The Netherlands: North Holland Publishing, 1997. 
[19] R. Ahlswede and J. Korner, "Source coding with side information and a converse for the degraded broadcast channel," IEEE Trans. 

Information Theory, vol. 21, pp. 629-637, 1975. 
[20] A. Wyner, "On source coding with side information at the decoder," IEEE Trans. Information Theory, vol. IT-21, pp. 294-300, 1975. 
[21] T. Berger, "Multiterminal source coding," in The Information Theory Approach to Communications, G. Longo, Ed. New York: Springer-Verlag, 1977.

[22] L. F. Wei and G. D. Forney, "Multidimensional constellation - part I: Introduction, figures of merit, and generalized cross constellations," 
vol. 7, pp. 877-892, Aug. 1989. 






[23] A. Cohen and R. Zamir, "Entropy amplification property and the loss for writing on dirty paper," IEEE Trans. Information Theory, To 
appear, April 2008. 

[24] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge: Cambridge University Press, 2004. 



